1,185 research outputs found

Language models and probability of relevance

Author: Hiemstra D.
Robertson S.E.
Publication venue: Carnegie Mellon University
Publication date: 01/01/2001

this document; the equation then represents the probability that the document that the user had in mind was in fact this one. Hiemstra [1] gives the same equation a slightly different justification. The basic assumption is the same (the user is assumed to have a specific document in mind and to generate the query on the basis of this document), but instead of smoothing, the user is assumed to assign a binary importance value to each term position in the query. An important term position is filled with a term from the document; a non-important one is filled with a general language term. If we define $\lambda_i = P(\text{term position } i \text{ is important})$, then we get $P(D, T_1, T_2, \ldots, T_n) = P(D) \prod_{i=1}^{n} ((1 - \lambda_i) P(T_i) + \ldots$

CiteSeerX

University of Twente Research Information

Relevance feedback for best match term weighting algorithms in information retrieval

Author: Hiemstra D.
Robertson S.E.
Publication venue: European Research Consortium for Informatics and Mathematics
Publication date: 01/01/2001

Personalisation in full text retrieval or full text filtering implies reweighting of the query terms based on some explicit or implicit feedback from the user. Relevance feedback inputs the user's judgements on previously retrieved documents to construct a personalised query or user profile. This paper studies relevance feedback within two probabilistic models of information retrieval: the first based on statistical language models and the second based on the binary independence probabilistic model. The paper shows the resemblance of the approaches to relevance feedback of these models, introduces new approaches to relevance feedback for both models, and evaluates the new relevance feedback algorithms on the TREC collection. The paper shows that there are no significant differences between simple and sophisticated approaches to relevance feedback

CiteSeerX

Radboud Repository

University of Twente Research Information

Interleukin-18

Author: Gracie J.A.
McInnes I.B.
Robertson S.E.
Publication venue: Society for Leukocyte Biology
Publication date: 01/02/2003

Interleukin-18 (IL-18), a recently described member of the IL-1 cytokine superfamily, is now recognized as an important regulator of innate and acquired immune responses. IL-18 is expressed at sites of chronic inflammation, in autoimmune diseases, in a variety of cancers, and in the context of numerous infectious diseases. This short review will describe the basic biology of IL-18 and thereafter address its potential effector and regulatory role in several human disease states including autoimmunity and infection. IL-18, previously known as interferon-gamma (IFN-gamma)-inducing factor, was identified as an endotoxin-induced serum factor that stimulated IFN-gamma production by murine splenocytes [1]. IL-18 was cloned from a murine liver cell cDNA library generated from animals primed with heat-killed Propionibacterium acnes and subsequently challenged with lipopolysaccharide [2]. Nucleotide sequencing of murine IL-18 predicted a precursor polypeptide of 192 amino acids lacking a conventional signal peptide and a mature protein of 157 amino acids. Subsequent cloning of human IL-18 cDNA revealed 65% homology with murine IL-18 [3] and showed that both contain an unusual leader sequence consisting of 35 amino acids at their N terminus

Enlighten

Disambiguation strategies for cross-language information retrieval

Author: D. Harman
G. Salton
S.E. Robertson
Publication venue: Springer Verlag
Publication date: 01/01/1999

This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) that are developed within the Twenty-One project. The tools and methods are evaluated with the TREC CLIR task document collection using Dutch queries on the English document base. The main issue addressed here is an evaluation of two approaches to disambiguation. The underlying question is whether a lot of effort should be put into finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results. The experimental study suggests that the quality of search methods is more important than the quality of disambiguation methods. Good retrieval methods are able to disambiguate translated queries implicitly during searching

CiteSeerX

Crossref

Radboud Repository

University of Twente Research Information

Query Expansion Using PRF-CBD Approach for Documents Retrieval

Author: A. Kaczmarek
C.D. Manning
S. Robertson
S.E. Robertson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Crossref

The quantum probability ranking principle for information retrieval

Author: C.X. Zhai
H. Chen
M. Eisenberg
M.E. Maron
R.P. Feynman
S.E. Robertson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

While the Probability Ranking Principle for Information Retrieval provides the basis for formal models, it makes a very strong assumption regarding the dependence between documents. However, it has been observed that in real situations this assumption does not always hold. In this paper we propose a reformulation of the Probability Ranking Principle based on quantum theory. Quantum probability theory naturally includes interference effects between events. We posit that this interference captures the dependency between the judgement of document relevance. The outcome is a more sophisticated principle, the Quantum Probability Ranking Principle, that provides a more sensitive ranking which caters for interference/dependence between documents’ relevance

Crossref

Queensland University of Technology ePrints Archive

Enlighten

Synchronous collaborative information retrieval: techniques and evaluation

Author: A.F. Smeaton
I.J. Aalbersberg
J. Pickens
M.R. Morris
N. Craswell
R.W. White
S.E. Robertson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Synchronous Collaborative Information Retrieval refers to systems that support multiple users searching together at the same time in order to satisfy a shared information need. To date most SCIR systems have focussed on providing various awareness tools in order to enable collaborating users to coordinate the search task. However, requiring users to both search and coordinate the group activity may prove too demanding. On the other hand without effective coordination policies the group search may not be effective. In this paper we propose and evaluate novel system-mediated techniques for coordinating a group search. These techniques allow for an effective division of labour across the group whereby each group member can explore a subset of the search space. We also propose and evaluate techniques to support automated sharing of knowledge across searchers in SCIR, through novel collaborative and complementary relevance feedback techniques. In order to evaluate these techniques, we propose a framework for SCIR evaluation based on simulations. To populate these simulations we extract data from TREC interactive search logs. This work represents the first simulations of SCIR to date and the first such use of this TREC data

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Towards a Better Understanding of the Relationship between Probabilistic Models in IR

Author: C. Zhai
C.D. Manning
D.W. Hosmer
F. Crestani
J. Lafferty
J.M. Ponte
K. Spärck-Jones
N. Fuhr
R.W.P. Luk
S.E. Robertson
T. Roelleke
V. Lavrenko
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Probability of relevance (PR) models are generally assumed to implement the Probability Ranking Principle (PRP) of IR, and recent publications claim that PR models and language models are similar. However, a careful analysis reveals two gaps in the chain of reasoning behind this statement. First, the PRP considers the relevance of particular documents, whereas PR models consider the relevance of any query-document pair. Second, unlike PR models, language models consider draws of terms and documents. We bridge the first gap by showing how the probability measure of PR models can be used to define the probabilistic model of the PRP. Furthermore, we argue that given the differences between PR models and language models, the second gap cannot be bridged at the probabilistic model level. We instead define a new PR model based on logistic regression, which has a similar score function to the one of the query likelihood model. The performance of both models is strongly correlated, hence providing a bridge for the second gap at the functional and ranking level. Understanding language models in relation with logistic regression models opens ample new research directions which we propose as future work

Crossref

Ghent University Academic Bibliography

Ontology Based Query Expansion with a Probabilistic Retrieval Model

Author: J. Bhogal
K. Sparck-Jones
S.E. Robertson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

This paper examines the use of ontologies for defining query context. The information retrieval system used is based on the probabilistic retrieval model. We extend the use of relevance feedback (RFB) and pseudo-relevance feedback (PF) query expansion techniques using information from a news domain ontology. The aim is to assess the impact of the ontology on the query expansion results with respect to recall and precision. We also tested the results for varying the relevance feedback parameters (number of terms or number of documents). The factors which influence the success of ontology based query expansion are outlined. Our findings show that ontology based query expansion has had mixed success. The use of the ontology has vastly increased the number of relevant documents retrieved; however, we conclude that for both types of query expansion, the PF results are better than the RFB results

City Research Online

Crossref

Birmingham City University Open Access Repository

BCU Open Access

Parallel methods for the update of partitioned inverted files

Author: A. MacFarlane
David Bawden
J.A. McCann
S.E. Robertson
Publication venue: Emerald
Publication date: 12/07/2007

Purpose – An issue which tends to be ignored in information retrieval is the issue of updating inverted files. This is largely because inverted files were devised to provide fast query service, and much work has been done with the emphasis strongly on queries. In this paper we study the effect of using parallel methods for the update of inverted files in order to reduce costs, by looking at two types of partitioning for inverted files: document identifier and term identifier. Design/methodology/approach – Raw update service and update with query service are studied with these partitioning schemes using an incremental update strategy. We use standard measures used in parallel computing such as speedup to examine the computing results and also the costs of reorganising indexes while servicing transactions. Findings – Empirical results show that for both transaction processing and index reorganisation the document identifier method is superior. However, there is evidence that the term identifier partitioning method could be useful in a concurrent transaction processing context. Practical implications – There is an increasing need to service updates which is now becoming a requirement of inverted files (for dynamic collections such as the Web), demonstrating that a shift in requirements of inverted file maintenance is needed from the past. Originality/value – The paper is of value to database administrators who manage large-scale and dynamic text collections, and who need to use parallel computing to implement their text retrieval services

City Research Online

Crossref